ABSTRACT

Cumulative survival, failure, or detection probabilities cannot in
general be precisely estimated from truncated samples if only data
grouped in successive time intervals are available. Mathematical models
of the failure rate and abort rate within the time intervals are
postulated, from which estimates may be obtained from grouped data when
the models are valid. An easily calculable approximation formula can be
used in the earlier time intervals, where the sample size is relatively
large. This can provide data for verifying or rejecting a given model
prior to making calculations in later intervals, where the smaller sample
size would otherwise diminish the reliability of the resulting
probabilities.




INTRODUCTION 


The cumulative probability as a function of time of the occurrence
of some significant event, such as the failure of an element or the
detection of a target, is often useful in describing the effectiveness
of a system. This probability may be estimated from data giving the
times of occurrence of the event in a sample consisting of a number of
observations of the operation of the system or of similar systems.

When no observations in the sample are terminated except when the event
of interest occurs, the sample may be described as "uncensored" or
"non-truncated." In such cases the appropriate estimate of the cumulative
probability of occurrence of the event to time T is the fraction of the
sample in which the event has occurred prior to T, reference (a). However,
if the sample is truncated because it contains aborted observations which
terminated prior to the occurrence of the event, the appropriate estimate
of the cumulative probability is not so readily obtainable. Methods given
in references (a) and (b) are not rigorous because they depend on a
derivation of the expected value of the probability at the time of the
first occurrence of the event which excludes the possibility of abort. The
present paper suggests other methods of deriving cumulative probability
estimates from truncated samples.

DEFINITION  OF  PROBLEM 

Find the cumulative survival probability, $P_s$, or the cumulative
failure probability, $P_f = 1 - P_s$, as a function of time from data
given in the form: In successive time intervals $\Delta t_i$, with
$\Delta t_i = t_i - t_{i-1}$ $(t_0 = 0)$,

$N_i$ elements were present at the beginning of the interval, i.e.,
at time $t_{i-1}$;

$r_i$ elements failed during the interval;

$a_i$ elements aborted from the sample during the interval but prior
to failing.

Data are usually grouped in equal time intervals, but the $\Delta t_i$'s
need not be equal in this formulation. For application to problems
involving detection of targets, "non-detection" and "detection" may be
substituted for "survival" and "failure" respectively in the above
definition.

CUMULATION  FORMULAS


In general,

$$P_s(t_i) = P_s(t_{i-1})\, p_{si} \tag{1}$$

with $P_s(t_i)$ = cumulative probability of surviving to the end of
$\Delta t_i$, and $p_{si}$ = probability of not failing within
$\Delta t_i$. This may also be written as

$$P_s(t_i) = \prod_{j=1}^{i} p_{sj} \tag{2}$$

Alternatively, in terms of failure probability, where
$P_f = 1 - P_s$ and $p_{fi} = 1 - p_{si}$:

$$P_f(t_i) = P_f(t_{i-1}) + [1 - P_f(t_{i-1})]\, p_{fi} \tag{3}$$

is equivalent to equation (1) and

$$P_f(t_i) = 1 - \prod_{j=1}^{i} (1 - p_{fj}) \tag{4}$$

to equation (2).
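The cumulation formulas can be sketched in a few lines of Python. This is only an illustration: the function names and the three per-interval survival values are hypothetical and do not come from the paper.

```python
from itertools import accumulate
from operator import mul

def cumulative_survival(p_s):
    """Equation (2): running product of per-interval survival probabilities."""
    return list(accumulate(p_s, mul))

def cumulative_failure(p_f):
    """Equation (3) applied one interval at a time, starting from P_f = 0."""
    out, P_f = [], 0.0
    for p in p_f:
        P_f = P_f + (1.0 - P_f) * p
        out.append(P_f)
    return out

p_s = [0.9, 0.8, 0.95]                      # hypothetical p_s values
P_s = cumulative_survival(p_s)
P_f = cumulative_failure([1.0 - p for p in p_s])
print(P_s)
print(P_f)
```

Both routes describe the same curve, so each entry of `P_s` and the matching entry of `P_f` should sum to one.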


NON-TRUNCATED  CASE 


If there are no aborts, $a_i = 0$ and the usual definition of $p_{si}$
applies:

$$p_{si} = \frac{N_i - r_i}{N_i} \tag{5}$$

For a set of data which is not truncated, $N_i = N_{i-1} - r_{i-1}$, so
that substitution of equation (5) into equation (2) gives

$$P_s(t_i) = \prod_{j=1}^{i} \frac{N_j - r_j}{N_j}
           = \frac{N_1 - \sum_{j=1}^{i} r_j}{N_1},
\qquad
P_f(t_i) = \frac{\sum_{j=1}^{i} r_j}{N_1},$$

the conventional result that the cumulative failure probability is the
ratio of failures to trials.

TRUNCATED  CASE  WITH  DISCRETE  DATA 

Another special case of the general problem defined above arises
when the data gives the exact time of each failure and abort. In this
case the events, both failures and aborts, may be ordered chronologically
and $t_i$ chosen to be the time of occurrence of the i-th event. Then, if
the i-th event is a failure, $r_i = 1$, $a_i = 0$,
$p_{si} = \frac{N_i - 1}{N_i} = \frac{N_{i+1}}{N_i}$;
similarly, $r_i = 0$, $a_i = 1$, $p_{si} = 1$, if the i-th event is an
abort. Substitution of these values of $p_{si}$ into equations (2) or (3)
gives an estimate of the cumulative probabilities at the time of
occurrence of each failure or abort. A computational shortcut is available
when a sequence of n failures is uninterrupted by aborts since repeated
application of equation (1) shows that

$$P_s(t_i) = P_s(t_{i-n})\, \frac{N_{i+1}}{N_{i-n+1}}$$

when event i and the n - 1 preceding events are all failures. This might
also be derived from consideration of equation (5).
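A minimal Python sketch of the discrete case follows. The event list, the sample size of five, and the function name `discrete_survival` are hypothetical; only the rule (multiply by (N - 1)/N at a failure, by 1 at an abort) is taken from the text.

```python
def discrete_survival(events, n_start):
    """events: chronological list of (time, kind), kind is "fail" or "abort".

    Returns [(time, P_s)] after each event, using p_s = (N - 1)/N at a
    failure and p_s = 1 at an abort, cumulated as in equation (1).
    """
    n, P_s, out = n_start, 1.0, []
    for t, kind in events:
        if kind == "fail":
            P_s *= (n - 1) / n
        n -= 1  # either kind of event removes one element from the sample
        out.append((t, P_s))
    return out

# Hypothetical data: five elements, two failures, one abort, one failure.
events = [(1.0, "fail"), (2.5, "fail"), (3.0, "abort"), (4.2, "fail")]
curve = discrete_survival(events, n_start=5)
for t, P in curve:
    print(t, round(P, 3))
```

After the first two failures the shortcut applies directly: three of the five elements remain, so the cumulative survival probability is 3/5, and the abort that follows leaves it unchanged.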
TRUNCATED  CASE  WITH  GROUPED  DATA

With grouped truncated data the definition of $p_{si}$ given by
equation (5) does not hold unless the assumption is made that all aborts
occur at the end of the time interval. If, on the other hand, it is assumed
that all aborts occur at the beginning of $\Delta t_i$, the equivalent
form of equation (5) is

$$p_{si} = \frac{N_i - a_i - r_i}{N_i - a_i} \tag{6}$$

As a third hypothesis, assume that all aborts occur simultaneously
somewhere within the time interval, so that $r_i'$ failures occur prior to
the aborts and the remaining $r_i - r_i'$ after the aborts. Then

$$p_{si} = \frac{N_i - r_i'}{N_i} \cdot
           \frac{N_i - a_i - r_i}{N_i - a_i - r_i'} \tag{7}$$

Thus, the value of $p_{si}$ depends on when the aborts occur. It is
assumed that this is not known for the grouped data case. Nevertheless,
it is possible to place limits on the value of $p_{si}$ since equation (7)
always gives values between those of equations (5) and (6). Thus,

$$\frac{N_i - a_i - r_i}{N_i - a_i} \le p_{si} \le \frac{N_i - r_i}{N_i}
\tag{8}$$

or alternatively

$$\frac{r_i}{N_i} \le p_{fi} \le \frac{r_i}{N_i - a_i} \tag{9}$$
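The limits can be checked numerically with a short Python sketch. The function names are hypothetical, and the split formula is written out from the third hypothesis as described in the text (some failures, then one burst of aborts, then the remaining failures); the values N = 100, r = 15, a = 10 are the first row of the numerical example.

```python
def survival_bounds(N, r, a):
    """Equation (8): (lower, upper) limits on p_s for one interval."""
    lower = (N - a - r) / (N - a)  # all aborts at the start of the interval
    upper = (N - r) / N            # all aborts at the end of the interval
    return lower, upper

def survival_with_split(N, r, a, r_before):
    """r_before failures, then all a aborts at once, then r - r_before failures."""
    return (N - r_before) / N * (N - a - r) / (N - a - r_before)

N, r, a = 100, 15, 10
lo, hi = survival_bounds(N, r, a)
values = [survival_with_split(N, r, a, k) for k in range(r + 1)]
print(lo, hi)
print(min(values), max(values))
```

Moving the burst of aborts from the start of the interval (`r_before = 0`) to the end (`r_before = r`) sweeps the estimate from the lower limit to the upper limit.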


MATHEMATICAL  MODELS  OF  BEHAVIOR  DURING  At 

Since, within the limits given by equations (8) and (9), the values
of the survival and failure probabilities during $\Delta t_i$ depend on
the history of the failures and aborts within the time interval, it is
appropriate to compare the results which arise from various reasonable
assumptions about this history. Dr. Joseph H. Engel of the Operations
Evaluation Group has proposed the following model: For convenience of
notation define a new time variable $\theta = (t - t_{i-1})/\Delta t_i$
such that $\theta = 0$ at $t = t_{i-1}$, the beginning of $\Delta t_i$,
and $\theta = 1$ at $t = t_i$, the end of $\Delta t_i$.

Let $f'(\theta)$ = the unforestalled failure probability density, that is,
the rate of failure assuming there is no abort mechanism in
operation

$w'(\theta)$ = the unforestalled abort probability density (rate of
aborts assuming there is no failure mechanism in operation).

Then $f(\theta) = \int_0^{\theta} f'(s)\,ds$ = the unforestalled
cumulative probability of failure to time $\theta$, with
$w(\theta) = \int_0^{\theta} w'(s)\,ds$ defined analogously for aborts,
and the probability of failure within $\Delta t_i$ is

$$p_f = f(1) \tag{10}$$

(The subscript i is omitted here and below where it is understood that
only the i-th interval is under consideration.)

Then, if the failure and abort mechanisms are statistically independent,
it follows that the probability of failure during $\Delta t_i$, allowing
for the probability of failure being forestalled by aborts, is

$$F = \int_0^1 [1 - w(s)]\, f'(s)\,ds \tag{11}$$

Similarly, the probability of abort during $\Delta t_i$, with the
probability of abort being forestalled by failure included, is

$$W = \int_0^1 [1 - f(s)]\, w'(s)\,ds \tag{12}$$


Then with N elements in the sample at the beginning of the interval the
expected number of failures (with forestalling by aborts accounted for)
is NF. This expected number of failures may be set equal to the observed
number of failures:

$$NF = r \tag{13}$$

and similarly

$$NW = a \tag{14}$$

Equations (13) and (14) provide unbiased estimates of F and W.

Exponential  Rates  of  Failures  and  Aborts 

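Assuming

$$f'(\theta) = b\,e^{-b\theta} \tag{15}$$

$$w'(\theta) = c\,e^{-c\theta} \tag{16}$$

produces from equations (11) and (12)

$$F = \frac{b}{b+c}\left[1 - e^{-(b+c)}\right] \tag{17}$$

and

$$W = \frac{c}{b+c}\left[1 - e^{-(b+c)}\right] \tag{18}$$

Solving these simultaneously with equations (13) and (14) gives

$$b = \begin{cases}
\dfrac{r}{r+a}\,\log_e \dfrac{N}{N - r - a}, & \text{for } r + a > 0 \\
0, & \text{for } r = a = 0
\end{cases} \tag{19}$$

Then from equations (10) and (15)

$$p_f = \begin{cases}
1 - \left(1 - \dfrac{r+a}{N}\right)^{r/(r+a)}, & \text{for } r + a > 0 \\
0, & \text{for } r = a = 0
\end{cases} \tag{20}$$

or

$$p_s = \begin{cases}
\left(1 - \dfrac{r+a}{N}\right)^{r/(r+a)}, & \text{for } r + a > 0 \\
1, & \text{for } r = a = 0
\end{cases} \tag{21}$$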
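The exponential-rates result can be exercised with a small Python sketch. The function names are hypothetical, and the expression for c is obtained here by symmetry with b (interchanging r and a), which the text does not state explicitly. The check uses N = 11, r = 4, a = 1, the ninth interval of the numerical example.

```python
import math

def rates_exponential(N, r, a):
    """b from equation (19); c by the symmetric formula (an assumption here).

    Requires r + a < N so that the logarithm is finite.
    """
    if r + a == 0:
        return 0.0, 0.0
    total = math.log(N / (N - r - a))  # this is b + c
    return r / (r + a) * total, a / (r + a) * total

def p_s_exponential(N, r, a):
    """Equation (21): per-interval survival probability."""
    if r + a == 0:
        return 1.0
    return (1.0 - (r + a) / N) ** (r / (r + a))

N, r, a = 11, 4, 1
b, c = rates_exponential(N, r, a)
F = b / (b + c) * (1.0 - math.exp(-(b + c)))  # equation (17)
W = c / (b + c) * (1.0 - math.exp(-(b + c)))  # equation (18)
print(N * F, N * W)                  # should recover r and a
print(round(p_s_exponential(N, r, a), 3))
```

Substituting the fitted b and c back into equations (17) and (18) returns the observed counts, and the survival probability equals exp(-b), as equation (10) requires.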
Constant  Rates  of  Failures  and  Aborts


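Assuming

$$f'(\theta) = h \tag{22}$$

$$w'(\theta) = k \tag{23}$$

gives

$$F = h\left[1 - \frac{k}{2}\right] \tag{24}$$

$$W = k\left[1 - \frac{h}{2}\right] \tag{25}$$

which may be solved simultaneously with equations (13) and (14) to produce

$$p_f = h = \frac{2N + r - a - \sqrt{(2N + r - a)^2 - 8Nr}}{2N} \tag{26}$$

$$p_s = 1 - p_f = \frac{a - r + \sqrt{(2N + r - a)^2 - 8Nr}}{2N} \tag{27}$$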
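A minimal sketch of the constant-rates solution in Python. Eliminating k from h(1 - k/2) = r/N and k(1 - h/2) = a/N leaves a quadratic in h. The function name `p_f_constant` and the choice of the smaller root (the one that tends to 0 as r tends to 0) are assumptions of this sketch rather than notation from the paper.

```python
import math

def p_f_constant(N, r, a):
    """Solve h(1 - k/2) = r/N and k(1 - h/2) = a/N for h = p_f.

    Subtracting the two equations gives k = h - r/N + a/N, which turns the
    first equation into h**2 - (2 + r/N - a/N) h + 2 r/N = 0.
    """
    rho, alpha = r / N, a / N
    s = 2.0 + rho - alpha
    return (s - math.sqrt(s * s - 8.0 * rho)) / 2.0

N, r, a = 100, 15, 10
h = p_f_constant(N, r, a)
k = h - r / N + a / N
print(h, k)
print(N * h * (1 - k / 2), N * k * (1 - h / 2))  # should recover r and a
```

The discriminant equals 4(1 - (r + a)/N) + ((r - a)/N)^2, so it is non-negative whenever r + a does not exceed N.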


Other functional forms could be postulated for $f'(\theta)$ and
$w'(\theta)$ and, as long as they involve exactly two constants, it is
theoretically possible to solve the simultaneous equations (11) through
(14) for these constants and thus derive expressions for $p_f$ and $p_s$
as above. Since the two sets of assumptions on failure and abort rates
already examined are as reasonable as many others that might be
postulated, it does not appear worthwhile to pursue this approach further
here. However, the expressions (21) and (27) are somewhat cumbersome to
evaluate, especially in the absence of computational aids, so that
consideration of simpler expressions approximating these equations may be
fruitful.

Average  of  Limits  Approximation 

One such approximation is the arithmetic mean between the limits of
equations (8) and (9). These may be written as

$$p_s = \frac{1}{2}\left[\frac{N - a - r}{N - a} + \frac{N - r}{N}\right]
\tag{28}$$

$$p_f = \frac{1}{2}\left[\frac{r}{N} + \frac{r}{N - a}\right] \tag{29}$$

Average  Sample  Size  Approximation 

A simpler expression from the point of view of computational ease
may be derived by substituting a/2 for a in equation (6) giving

$$p_s = \frac{N - \frac{a}{2} - r}{N - \frac{a}{2}} \tag{30}$$

$$p_f = \frac{r}{N - \frac{a}{2}} \tag{31}$$

These last two equations may be thought of as the result of assuming that
the average number of elements in the time interval is the number at the
beginning decreased by half the number of aborts.
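The four per-interval failure estimates can be put side by side in Python. This is a sketch only: the function names are hypothetical, the constant-rates function solves the pair F = h(1 - k/2), W = k(1 - h/2) with NF = r and NW = a by taking the smaller root of the resulting quadratic, and the values N = 100, a = 20 are chosen to match the one-fifth abort fraction used for the comparison figure.

```python
import math

def p_f_exponential(N, r, a):
    if r + a == 0:
        return 0.0
    return 1.0 - (1.0 - (r + a) / N) ** (r / (r + a))

def p_f_constant(N, r, a):
    s = 2.0 + (r - a) / N
    return (s - math.sqrt(s * s - 8.0 * r / N)) / 2.0

def p_f_average_of_limits(N, r, a):
    return 0.5 * (r / N + r / (N - a))

def p_f_average_sample_size(N, r, a):
    return r / (N - a / 2.0)

N, a = 100, 20
rows = {}
for r in (5, 10, 20, 40):
    rows[r] = (
        p_f_exponential(N, r, a),
        p_f_constant(N, r, a),
        p_f_average_of_limits(N, r, a),
        p_f_average_sample_size(N, r, a),
    )
    print(r, [round(v, 4) for v in rows[r]])
```

For these inputs the Average Sample Size value is the smallest of the four in every row, and the Exponential Rates value lies above the Constant Rates value for r < a and below it for r > a.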

COMPARISON  OF  RESULTS  FROM  VARIOUS  MODELS 

Figure 1 shows in graphical form how the failure probabilities
derived from the four expressions arrived at above behave as a function
of r/N, the fraction failing within time interval $\Delta t$, for the
particular case in which one-fifth of the initial elements abort during
$\Delta t$. However, the following observations apply for all values of
a/N:

(a) The $p_f$ value from the Exponential Rates Model exceeds the $p_f$
from the Constant Rates Model from r = 0 to r = a and falls
short of it thereafter.

(b) For any value of r/N the Average Sample Size value of $p_f$ is
always less than all the others. Thus, this gives a
"conservative" estimate of the failure probability.

[Figure 1. Curve labels: Upper Limit, Eqn (9); Constant Rates, Eqn (26);
Average of Limits, Eqn (29); Average Sample Size, Eqn (31); Lower Limit,
Eqn (9). Horizontal axis: r/N, fraction failing within $\Delta t$; the
fraction aborting is held fixed.]

FIG.  1:  COMPARISON  OF  FAILURE  PROBABILITIES  DERIVED
FROM  VARIOUS  MODELS


(c) When r/N is less than about .25 and a > r, the $p_f$ value from
the Average of Limits computation exceeds all the others.
This estimate is therefore not "conservative" in this region.

(d) All the estimates of $p_f$ considered lie very close together
when r/N and a/N are small.

In order to quantify this last observation the maximum absolute
differences between the Constant Rates or Exponential Rates values and
each of the Average Sample Size and Average of Limits values were
calculated. Figure 2 shows curves of constant differences between $p_f$
(or $p_s$) values from the Exponential Rates Model or the Constant Rates
Model, whichever is larger, and the values from the Average Sample Size
formula. From curves of this type, figure 3 was derived showing maximum
absolute differences as a function of $(r+a)/N$, the fraction of the
initial number withdrawn from the sample during $\Delta t$, either by
failure or abort. For values of this fraction up to 0.4, using the most
easily calculable approximations, equations (30) and (31), will produce
differences no greater than .0032. Since probabilities are ordinarily
quoted to only 2 decimal places this approximation will usually suffice.
When $(r+a)/N$ exceeds 0.4, the limiting values on $p_s$ and $p_f$,
equations (8) and (9), are so far apart that the confidence interval on an
estimate from any model would be large unless the model could be verified.
Ordinarily this value of $(r+a)/N$ will be exceeded only in later time
intervals $\Delta t_i$ when the sample size has become small. At this
point one could plot the cumulative survival probabilities already
obtained and also plot cumulative non-abort probabilities derived from
formulas analogous to equations (2) and (30):

$$P_{na}(t_i) = \prod_{j=1}^{i} p_{naj} \tag{32}$$

$$p_{nai} = \frac{N_i - \frac{r_i}{2} - a_i}{N_i - \frac{r_i}{2}} \tag{33}$$

If these plots appear to fit a straight line on semi-logarithmic paper
the assumption of Exponential Rates of Failures and Aborts is appropriate
and one may proceed confidently with that model. This method is
illustrated in the following numerical example.

[Figure 2. Plotted quantity: max of |Constant Rates - Average Sample
Size|, |Exponential Rates - Average Sample Size|.]

FIG.  2:  CURVES  OF  CONSTANT  DIFFERENCE  BETWEEN  VALUES  OF
$p_f$  OR  $p_s$  CALCULATED  FROM  CONSTANT  RATES  OR
EXPONENTIAL  RATES

[Figure 3. Recoverable label: Maximum Difference.]

NUMERICAL  EXAMPLE 


Columns (3), (4), and (5) of table I give hypothetical data for
equal consecutive time intervals of length T:

$N_i$ is the sample size at the beginning of the i-th interval;

$r_i$ is the number of failures within the interval;

$a_i$ is the number of aborts within the interval.

Column (6) gives the empirical probability of surviving to the end of
the interval on condition of being present at the beginning of the
interval. These are calculated from equation (30) except in the cases
indicated by asterisks where $r_i + a_i > 0.4 N_i$. In these cases
equation (21) is used. Column (7) gives the cumulative survival
probability to the end of the i-th interval obtained from equation (2).
Column (8) gives the empirical probability of not aborting within the
interval on condition of being present at the beginning of the interval,
obtained from equation (33). Column (9) gives the cumulative non-abort
probability from equation (32).

Because $r_9 + a_9 > 0.4 N_9$, $P_s(t_i)$ and $P_{na}(t_i)$ for i = 1 to 8
were plotted as shown in figure 4 to validate the Exponential Rates of
Failures and Aborts Model before proceeding further with the calculations.
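The straight-line check on semi-logarithmic paper can be imitated numerically. The sketch below swaps visual inspection for an ordinary least-squares fit of ln P against the interval index and reports the squared correlation; that substitution, and the function name `semilog_fit`, are assumptions of this example rather than the paper's procedure. The data are columns (7) and (9) of table I for i = 1 to 8.

```python
import math

# Columns (7) and (9) of table I, intervals 1 through 8.
P_s = [0.842, 0.701, 0.622, 0.519, 0.411, 0.360, 0.319, 0.251]
P_na = [0.892, 0.801, 0.710, 0.657, 0.635, 0.531, 0.442, 0.442]

def semilog_fit(P):
    """Least-squares line through the points (i, ln P_i), i = 1..len(P).

    Returns (slope, intercept, r_squared).
    """
    x = list(range(1, len(P) + 1))
    y = [math.log(p) for p in P]
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    syy = sum((yi - y_mean) ** 2 for yi in y)
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean, sxy * sxy / (sxx * syy)

slope_s, icept_s, r2_s = semilog_fit(P_s)
slope_na, icept_na, r2_na = semilog_fit(P_na)
print(slope_s, r2_s)
print(slope_na, r2_na)
```

A squared correlation near one corresponds to points lying close to a straight line on the semi-logarithmic plot; how close is close enough remains a judgment call, as it is with the graphical method.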

Figure 5 shows the fit of the resulting $P_s(t_i)$ points to
$P_s(t) = e^{-t/m}$, where m is the mean-time-to-failure derived from the
original data by the following method.

MEAN-TIME-TO-FAILURE

In the discrete case where the exact time of each failure and abort is
known, the mean-time-to-failure (MTF) is

$$m = \frac{\sum_i (r_i + a_i)\, t_i}{\sum_i r_i} \tag{34}$$

where $t_i$ is the time of the i-th event with $r_i = 1$ and $a_i = 0$ if
this event is a failure or $r_i = 0$ and $a_i = 1$ if the i-th event is an
abort.

TABLE  I

ILLUSTRATIVE  NUMERICAL  EXAMPLE

(1)   (2)    (3)    (4)    (5)    (6)       (7)        (8)       (9)
 i    t_i    N_i    r_i    a_i    p_si    P_s(t_i)    p_nai   P_na(t_i)

 1     T     100     15     10    .842      .842       .892      .892
 2    2T      75     12      7    .832      .701       .898      .801
 3    3T      56      6      6    .887      .622       .887      .710
 4    4T      44      7      3    .835      .519       .926      .657
 5    5T      34      7      1    .791      .411       .967      .635
 6    6T      26      3      4    .875      .360       .837      .531
 7    7T      19      2      3    .886      .319       .833      .442
 8    8T      14      3      0    .786      .251      1.0        .442
 9    9T      11      4      1    .616*     .155        -         -
10   10T       6      0      2   1.0        .155        -         -
11   11T       4      3      1    0*        0           -         -

* Calculated from equation (21) rather than equation (30) because
  r_i + a_i > 0.4 N_i.
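The survival and non-abort columns of table I can be regenerated from columns (3), (4), and (5) with a short Python sketch. The function names are hypothetical; the switching rule (Average Sample Size unless more than 0.4 of the interval's initial sample is withdrawn, Exponential Rates otherwise) follows the text. Because this sketch cumulates unrounded values, its cumulative figures can differ from the printed table by about one unit in the third decimal place.

```python
# Columns (3), (4), (5) of table I.
N = [100, 75, 56, 44, 34, 26, 19, 14, 11, 6, 4]
r = [15, 12, 6, 7, 7, 3, 2, 3, 4, 0, 3]
a = [10, 7, 6, 3, 1, 4, 3, 0, 1, 2, 1]

def p_s_interval(N_i, r_i, a_i):
    """Column (6): Exponential Rates form if r + a > 0.4 N, else a/2 form."""
    if r_i + a_i > 0.4 * N_i:
        return (1.0 - (r_i + a_i) / N_i) ** (r_i / (r_i + a_i))
    return (N_i - a_i / 2.0 - r_i) / (N_i - a_i / 2.0)

def p_na_interval(N_i, r_i, a_i):
    """Column (8): non-abort analogue with the roles of r and a exchanged."""
    return (N_i - r_i / 2.0 - a_i) / (N_i - r_i / 2.0)

p_s = [p_s_interval(*row) for row in zip(N, r, a)]
P_s, running = [], 1.0
for p in p_s:
    running *= p          # cumulate as a running product
    P_s.append(running)

# The table reports the non-abort columns for the first eight intervals only.
p_na = [p_na_interval(*row) for row in zip(N[:8], r[:8], a[:8])]

for i, (p, P) in enumerate(zip(p_s, P_s), start=1):
    print(f"{i:2d}  p_s = {p:.3f}  P_s = {P:.3f}")
```

The per-interval values in column (6) are reproduced to three decimals, and the input columns are self-consistent in that each N_i equals the previous N less the previous interval's failures and aborts.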
Reference (a) uses this formula also for data grouped in time
intervals $\Delta t_i$ with $t_i$ the time at the end of the i-th interval
and $r_i$ and $a_i$ the number of failures and aborts, respectively,
within the interval. For equal time intervals, $\Delta t_i = T$,
equation (34) can be written in the more convenient form

$$m' = \frac{T \sum_i N_i}{\sum_i r_i} \tag{35}$$

where m' indicates that this is only a first approximation to MTF for
data grouped in equal time intervals. This estimate is generally too
high because it assumes that the sample size $N_i$ at the beginning of the
i-th interval persists throughout the interval. Assuming exponential
rates of failure and abort, a correction factor may be derived:

Let probability of failure by time t be

$$P_f(t) = 1 - e^{-t/m} \qquad (m = \text{MTF}) \tag{36}$$

and probability of abort by time t be

$$P_a(t) = 1 - e^{-t/u} \tag{37}$$

with u = mean-time-to-abort. Then probability of withdrawal from the
sample by either failure or abort by time t is

$$P_w(t) = P_f(t) + P_a(t) - P_f(t)P_a(t)
         = 1 - e^{-t/m}\,e^{-t/u}
         = 1 - e^{-t/w} \tag{38}$$

where w, the mean-time-to-withdrawal from the sample, is found from

$$\frac{1}{w} = \frac{1}{m} + \frac{1}{u} \tag{39}$$

It follows that the number which have not failed or aborted at time s,
where $s = t - t_{i-1}$ so that $s(t_{i-1}) = 0$ and $s(t_i) = T$, is

$$N(s) = N_i\, e^{-s/w}.$$

Then the average sample size within the i-th time interval is

$$\bar{N}_i = \frac{1}{T}\int_0^T N_i\, e^{-s/w}\,ds
            = N_i\,\frac{w}{T}\left(1 - e^{-T/w}\right).$$

Substituting $\bar{N}_i$ for $N_i$ in equation (35) gives a better
approximation to MTF:

$$m = w\left(1 - e^{-T/w}\right)\frac{\sum_i N_i}{\sum_i r_i}.$$

The estimate of w to be used in this equation is obtained from an
equation analogous to (35):

$$w' = \frac{T \sum_i N_i}{\sum_i (r_i + a_i)}.$$

For the numerical example considered above $\sum_i N_i = 389$,
$\sum_i r_i = 62$, $\sum_i a_i = 38$ so that $w' = 3.89T$, $m' = 6.27T$,
and $m = 5.53T$.
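The mean-time-to-failure arithmetic for the numerical example can be reproduced in Python. This sketch measures time in units of the interval length (T = 1, an assumption made for convenience), and the variable names are hypothetical; the formulas are the first approximation m', the withdrawal estimate w', and the corrected m that uses the average sample size within each interval.

```python
import math

# Columns (3), (4), (5) of table I.
N = [100, 75, 56, 44, 34, 26, 19, 14, 11, 6, 4]
r = [15, 12, 6, 7, 7, 3, 2, 3, 4, 0, 3]
a = [10, 7, 6, 3, 1, 4, 3, 0, 1, 2, 1]

T = 1.0  # work in units of the interval length
sum_N, sum_r, sum_a = sum(N), sum(r), sum(a)

w_first = T * sum_N / (sum_r + sum_a)   # w', mean-time-to-withdrawal
m_first = T * sum_N / sum_r             # m', first approximation to MTF

# Correction: replace N_i by its average over the interval, assuming the
# sample decays as exp(-s/w) within the interval.
shrink = (w_first / T) * (1.0 - math.exp(-T / w_first))
m_corrected = m_first * shrink

print(sum_N, sum_r, sum_a)
print(round(w_first, 2), round(m_first, 2), round(m_corrected, 2))
```

The factor `shrink` is the ratio of the average sample size to the initial sample size in each interval; it is less than one, which is why the corrected estimate falls below the first approximation.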